Cross-language Projection of Dependency Trees with Constrained Partial Parsing for Tree-to-Tree Machine Translation
Abstract
Tree-to-tree machine translation (MT), which uses syntactic parse trees on both the source and target sides, suffers from non-isomorphism between the parse trees, caused by parsing errors and by differences in annotation criteria between the two languages. In this paper, we present a method that projects dependency parse trees from the language side that has a high-quality parser to the side that has a low-quality parser, in order to improve the isomorphism of the parse trees. We first project the subset of dependencies that have high confidence to build a partial parse tree, and then fill in the remaining dependencies by partial parsing constrained by the already projected dependencies. MT experiments verify the effectiveness of the proposed method.
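The two-step idea in the abstract (project confident dependencies, then complete the tree under those constraints) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the one-to-one alignment map, the confidence threshold, the greedy completion step, and the names `project_dependencies`, `complete_parse`, and `toy_score` are all assumptions made for illustration. In particular, the greedy head selection merely stands in for a real constrained partial parser.

```python
from typing import Callable, Dict, List

ROOT = -1  # hypothetical marker meaning "this token is the root"


def project_dependencies(src_heads: List[int],
                         alignment: Dict[int, int],
                         confidence: Dict[int, float],
                         threshold: float = 0.8) -> Dict[int, int]:
    """Project source arcs onto target tokens through one-to-one alignments.

    src_heads[i] is the head of source token i (ROOT for the root).
    alignment[i] is the target token aligned to source token i.
    confidence[i] is an assumed confidence score for that alignment link.
    Returns a partial map: target dependent -> target head.
    """
    partial: Dict[int, int] = {}
    for dep, head in enumerate(src_heads):
        if dep not in alignment or confidence.get(dep, 0.0) < threshold:
            continue  # dependent is unaligned or its link is not trusted
        if head == ROOT:
            partial[alignment[dep]] = ROOT
        elif head in alignment and confidence.get(head, 0.0) >= threshold:
            partial[alignment[dep]] = alignment[head]
    return partial


def creates_cycle(heads: Dict[int, int], dep: int, head: int) -> bool:
    """Return True if attaching dep under head would close a cycle."""
    node = head
    while node is not None and node != ROOT:
        if node == dep:
            return True
        node = heads.get(node)  # None when node has no head yet
    return False


def complete_parse(n: int,
                   partial: Dict[int, int],
                   score: Callable[[int, int], float]) -> List[int]:
    """Greedily pick heads for the remaining tokens.

    Projected arcs are kept fixed; new arcs must not create a cycle,
    and ROOT is offered only while no root exists.
    """
    heads = dict(partial)
    has_root = ROOT in heads.values()
    for dep in range(n):
        if dep in heads:
            continue  # constrained by an already projected dependency
        candidates = [h for h in range(n)
                      if h != dep and not creates_cycle(heads, dep, h)]
        if not has_root:
            candidates.append(ROOT)
        best = max(candidates, key=lambda h: score(h, dep))
        heads[dep] = best
        has_root = has_root or best == ROOT
    return [heads[i] for i in range(n)]


def toy_score(head: int, dep: int) -> float:
    """Toy arc score (an assumption): prefer close heads, slightly to the right."""
    if head == ROOT:
        return -100.0
    return -abs(head - dep) + (0.5 if head > dep else 0.0)


# Source tree over four tokens; the link for token 2 has low confidence.
src_heads = [1, ROOT, 3, 1]
alignment = {0: 0, 1: 1, 2: 2, 3: 3}
confidence = {0: 0.9, 1: 0.95, 2: 0.4, 3: 0.9}

partial = project_dependencies(src_heads, alignment, confidence)
full = complete_parse(4, partial, toy_score)
print(partial)  # {0: 1, 1: -1, 3: 1}
print(full)     # [1, -1, 3, 1]
```

In this toy run, three arcs are projected directly and only the head of token 2 is left to the completion step, which cannot override the projected arcs. A real system would replace `toy_score` with arc scores from the low-quality target-side parser.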
Similar resources
Cross-language Projection of Dependency Trees for Tree-to-tree Machine Translation
Syntax-based machine translation (MT) is an attractive approach for introducing additional linguistic knowledge into corpus-based MT. Previous studies have shown that tree-to-string and string-to-tree translation models perform better than tree-to-tree translation models, since tree-to-tree models require two high-quality parsers, one on the source side and one on the target side. In practice, high ...
Effective Constituent Projection across Languages
We describe an effective constituent projection strategy in which constituent projection is performed on the basis of dependency projection. In particular, a novel measure is proposed to evaluate the candidate projected constituents for a target-language sentence, and a PCFG-style parsing procedure is then used to search for the most probable projected constituent tree. Experiments show that, th...
Encoder-Decoder Shift-Reduce Syntactic Parsing
Encoder-decoder neural networks have been used for many NLP tasks, such as neural machine translation. They have also been applied to constituent parsing by treating bracketed tree structures as a target language, translating input sentences into syntactic trees. A more commonly used method to linearize syntactic trees is the shift-reduce system, which uses a sequence of transition actions to buil...
A Dependency Based Statistical Translation Model
We present a translation model based on dependency trees. The model adopts a tree-to-string approach and extends Phrase-Based translation (PBT) by using the dependency tree of the source sentence to select translation options and to reorder them. Decoding is done by translating each node in the tree and combining its translations with those of its head in alternative orders with respect t...
Efficient Convolution Kernels for Dependency and Constituent Syntactic Trees
In this paper, we provide a study on the use of tree kernels to encode syntactic parsing information in natural language learning. In particular, we propose a new convolution kernel, namely the Partial Tree (PT) kernel, to fully exploit dependency trees. We also propose an efficient algorithm for its computation, which is furthermore sped up by applying the selection of tree nodes with non-null k...